Integrating Talend Big Data Batch Jobs with MongoDB
Category |
Prerequisites | Talend Big Data Basics, Talend Big Data - Spark Batch, knowledge of Apache Spark, MongoDB collections, MongoDB query language and aggregation, Docker containers
Third-party software | Apache Spark, MongoDB, Docker
Description | Talend offers components and approaches that make it easier to create MongoDB collections and to formulate queries that extract information from them. Using earthquake data files publicly available on the Italian INGV website, the solution template builds a Talend Spark Big Data Batch Job to demonstrate a real use case. It shows how the downloaded data is loaded into a single collection, then prepared and reused for analysis.